Evaluating speech features with the minimal-pair ABX task (II): resistance to noise
نویسندگان
چکیده
The Minimal-Pair ABX (MP-ABX) paradigm has been proposed as a method for evaluating speech features for zeroresource/unsupervised speech technologies. We apply it in a phoneme discrimination task on the Articulation Index corpus to evaluate the resistance to noise of various speech features. In Experiment 1, we evaluate the robustness to additive noise at different signal-to-noise ratios, using car and babble noise from the Aurora-4 database and white noise. In Experiment 2, we examine the robustness to different kinds of convolutional noise. In both experiments we consider two classes of techniques to induce noise resistance: smoothing of the time-frequency representation and short-term adaptation in the time-domain. We consider smoothing along the spectral axis (as in PLP) and along the time axis (as in FDLP). For short-term adaptation in the time-domain, we compare the use of a static compressive non-linearity followed by RASTA filtering to an adaptive compression scheme.
منابع مشابه
Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline
We present a new framework for the evaluation of speech representations in zero-resource settings, that extends and complements previous work by Carlin, Jansen and Hermansky [1]. In particular, we replace their Same/Different discrimination task by several Minimal-Pair ABX (MP-ABX) tasks. We explain the analytical advantages of this new framework and apply it to decompose the standard signal pr...
متن کاملAnalysis of features from analytic representation of speech using MP-ABX measures
The significance of features derived from complex analytic domain representation of speech, for different applications, is investigated. Frequency domain linear prediction (FDLP) coefficients are derived from analytic magnitude and instantaneous frequency (IF) coefficients are derived from analytic phase of speech signals. Minimal pair ABX (MP-ABX) tasks are used to analyse different features a...
متن کاملSpeech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
متن کاملThe Relationship Between Acoustic Characteristics and Personality Dimensions in Patients With Dysphonia
Objectives: Voice is influenced by personality. However, it is still questionable which acoustic features are influenced by personality traits. This study aimed to investigate the relationship between acoustic characteristics and personality dimensions. Methods: Thirty-three participants with dysphonia and 33 participants without dysphonia were recruited to take part in this cross-sectional st...
متن کاملمقایسه روشهای مختلف یادگیری ماشین در خلاصهسازی استخراجی گفتار به گفتار فارسی بدون استفاده از رونوشت
In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of Speech summarization deals with extracting important and salient segments from speech in order to access, search, extract and browse speech files easier and in a less costly manner. In this paper, a new method for speech summarization without using automatic speech recognitio...
متن کامل